Digitization of Text Documents Using PDF/A

Related resources

Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents

Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents’ contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...

Digitization Errors In Hungarian Documents

Our task was to analyze a certain digitizing system, check what types of errors emerge during the process, and determine how these errors affect the searchability of the digitized documents. We have set up a testbed that is suitable for the automatic processing of digitized texts on a large scale. In this paper we briefly introduce the methodology of document digitization, emphasizing the error sources in...

Text Summarization Using XML-Tagged Documents

CL Research’s participation in the Document Understanding Conference extended the framework used in the TREC 2003 question-answering track, in which texts are parsed and processed into XML-tagged documents where sentence elements are marked with discourse, syntactic, and semantic attributes. This extension was made primarily to test the viability of using XML-tagged documents for summarization....

Journal

Journal title: Information Technology and Libraries

Year: 2018

ISSN: 2163-5226, 0730-9295

DOI: 10.6017/ital.v37i1.9878